Abstract

Complex networks have recently attracted much attention in diverse areas of science and 
technology. Many networks such as the WWW and biological networks are known to display 
spatial heterogeneity which can be characterized by their fractal dimensions. Multifractal 
analysis is a useful way to systematically describe the spatial heterogeneity of both theoretical 
and experimental fractal patterns. In this paper, we propose a new box covering algorithm 
for multifractal analysis of complex networks. This algorithm is used to calculate the
generalized fractal dimensions Dq of some theoretical networks, namely scale-free networks,
small-world networks and random networks, and one kind of real network, namely protein-protein
interaction (PPI) networks of different species. Our numerical results indicate the existence
of multifractality in scale-free networks and PPI networks, while the multifractal behavior is
not clear-cut for small-world networks and random networks. The possible variation of Dq
due to changes in the parameters of the theoretical network models is also discussed.

1 Introduction

Complex networks have been studied extensively due to their relevance to many real-world 
systems such as the world-wide web, the internet, energy landscapes, and biological and social 
systems [1].

It has been shown that many real complex networks share distinct characteristics that differ
in many ways from random and regular networks [2, 3]. Three fundamental properties of real
complex networks have attracted much attention recently: the small-world property [4, 5], the
scale-free property [6-8], and self-similarity [1]. The small-world property means that the
average shortest path length between vertices in the network is short, usually scaling
logarithmically with the size of the network [3]. A famous example is the so-called six degrees of
separation in social networks [5]. A large number of real networks are referred to as scale-free
because the probability distribution $P(k)$ of the number of links per node (also known as the
degree distribution) satisfies a power law $P(k) \sim k^{-\gamma}$ with the degree exponent
$\gamma$ varying in the range $2 < \gamma < 3$ [6]. In view of their small-world property, it was
believed that complex networks are not self-similar under a length-scale transformation. After
analyzing a variety of real complex networks, Song et al. [1] found that they consist of
self-repeating patterns on all length scales, i.e., they have self-similar structures. In order to
unfold the self-similar property of complex networks, Song et al. [1] calculated their fractal
dimension, a known useful characteristic of complex fractal sets [9-11], and found that the
box-counting method is a proper tool
for further investigations of network properties. Because a concept of metric on graphs is not 
as straightforward as the Euclidean metric on Euclidean spaces, the computation of the fractal 
dimension of networks via a box-counting approach is much more complicated than the
traditional box-counting algorithm for fractal sets in Euclidean spaces. Song et al. [12] developed
a more involved algorithm to calculate the fractal dimension of complex networks. Then Kim
et al. [13] proposed an improved algorithm by considering the skeleton of networks. Zhou et
al. [14] proposed an alternative algorithm, based on edge-covering box counting, to explore
the self-similarity of complex cellular networks. Later on, a ball-covering approach [15] and an
approach defined by the scaling property of the volume [3, 16] were proposed for calculating the
fractal dimension of complex networks.

The tools of fractal analysis provide a global description of the heterogeneity of an object, 
such as its fractal dimension. This approach is not adequate when the object may exhibit a 
multifractal behavior. Multifractal analysis is a useful way to systematically characterize the 
spatial heterogeneity of both theoretical and experimental fractal patterns [17, 18]. It was
initially proposed to treat turbulence data, and has recently been applied successfully in many
different fields including time series analysis [19], financial modelling [20], biological systems
[21-28] and geophysical systems [29-34]. For complex networks, Lee and Jung [2] found that
their behaviour is best described by a multifractal approach. As mentioned above, through the
recent works by Song et al. [1], Guo and Cai [3], Kim et al. [13], Zhou et al. [14], Gao et al. [15],
it was already a big step to go from the computation of the fractal dimension of a geometrical 
object to that of a network via the box-counting approach of fractal analysis. In this paper, 
we propose a new box-covering algorithm to compute the generalised fractal dimensions of a 
network. This is the next step to move from fractal analysis to multifractal analysis of complex 
networks. 

We first adapt the random sequential box covering algorithm [13] to calculate the fractal
dimension of the human protein-protein interaction network as well as that of its skeleton.
We next propose a box covering algorithm for multifractal analysis of networks in Section 2.
This algorithm is then used to calculate in Section 3 the generalized fractal dimensions Dq of 
generated examples of three classes of theoretical networks, namely scale-free networks, small- 
world networks and random networks, and one kind of real networks, namely protein-protein 
interaction networks of different species. The methods to generate the theoretical networks are 
described. The multifractal behaviour of these networks based on the computed generalised 
fractal dimensions Dq is then discussed. The possible variation of Dq due to changes in the 
parameters of the theoretical network models is also investigated. Some conclusions are then 
drawn in Section 4. 

2 Methods 

In this section, we first introduce the box covering methods for calculating the fractal 
dimension of complex networks and the traditional fixed-size box counting algorithms used for 
multifractal analysis. We then present our new approach for multifractal analysis of complex 
networks in detail. 

2.1 The box covering methods for calculation of fractal dimension 

Box covering is a basic tool to estimate the fractal dimension of conventional fractal objects 
embedded in the Euclidean space. The Euclidean metric is not relevant for complex networks. 
A more natural metric is the shortest path length between two nodes, which is defined as the 
number of edges in a shortest path connecting them. Shortest paths play an important role in 
the transport and communication within a network. It is useful to represent all the shortest path
lengths of a network as a matrix $D$ in which the entry $d_{ij}$ is the length of the shortest path
from node $i$ to node $j$. The maximum value in the matrix $D$ is called the network diameter,
which is the length of the longest shortest path between any two nodes in the network. Song et
al. [1] studied the fractality and self-similarity of complex networks by using box covering
techniques. They proposed several possible box covering algorithms [1] and applied them to a
number of models and real-world networks. Kim et al. [13] introduced another method called the
random sequential box covering method, which can be described as follows:

For a given network, let $N_B$ be the number of boxes of radius $r_B$ which are needed to cover
the entire network. The fractal dimension $d_B$ is then given by

$$N_B \sim r_B^{-d_B}.$$

By measuring the distribution of $N_B$ for different box sizes, the fractal dimension $d_B$ can be
obtained by power law fitting of the distribution. This algorithm has the following steps [13]:

(i) Select a node randomly at each step; this node serves as a seed which will be the center
of a box.

(ii) Search the network by distance $r_B$ from the seed and cover all nodes which are found but
have not been covered yet. Assign the newly covered nodes to the new box. If no newly
covered nodes have been found, then this box is discarded.

(iii) Repeat (i) and (ii) until all nodes in the network have been assigned to their respective
boxes.
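Steps (i) to (iii) can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of [13]: the graph is assumed to be given as a dictionary `adj` mapping each node to a list of neighbours, and the function names `nodes_within` and `random_sequential_boxes` are hypothetical. Drawing seeds from a random permutation is used here in place of repeated random selection; a seed whose neighbourhood is already covered simply produces an empty box, which is discarded.

```python
import random
from collections import deque


def nodes_within(adj, seed, radius):
    """Return all nodes whose shortest-path distance from seed is <= radius (BFS)."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not expand beyond the box radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def random_sequential_boxes(adj, radius, rng):
    """Cover the graph with boxes of the given radius and return the list of boxes."""
    order = list(adj)
    rng.shuffle(order)  # step (i): seeds are visited in random order
    covered, boxes = set(), []
    for seed in order:
        # step (ii): only nodes that are not yet covered join the new box
        new_nodes = nodes_within(adj, seed, radius) - covered
        if new_nodes:  # an empty box is discarded
            boxes.append(new_nodes)
            covered |= new_nodes
        if len(covered) == len(adj):  # step (iii): stop once everything is covered
            break
    return boxes


# Usage on a path graph 0-1-2-3-4-5 (an assumed toy example).
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
boxes = random_sequential_boxes(path, 1, random.Random(0))
print(len(boxes), boxes)
```

Counting `len(boxes)` for several radii and fitting a power law to the counts would then give an estimate of the fractal dimension.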

To obtain the skeleton of a complex network, we first need to calculate the edge betweenness
of all the edges in this network. The betweenness $b_i$, also referred to as load [13], is defined as

$$b_i = \sum_{j,k \in N,\; j \neq k} \frac{n_{jk}(i)}{n_{jk}},$$

where $N$ is the number of nodes, $n_{jk}$ is the number of shortest paths connecting nodes $j$ and
$k$, while $n_{jk}(i)$ is the number of shortest paths connecting nodes $j$ and $k$ and passing
through edge $i$. Similar to a minimum spanning tree, a skeleton is constructed so that edges which
have the highest betweenness and do not form loops are selected [13]. The remaining edges in the
original network are referred to as shortcuts that contribute to loop formation. As a result,
the distance between any two nodes in the original network may increase in the skeleton. For
example, in the human protein-protein interaction network, the largest distance between any
two nodes in the original network is 21 while the largest distance between any two nodes in its
skeleton is 27.
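The construction can be illustrated with the following sketch, which is an assumption-laden reading of the description above rather than the code of [13]. Edge betweenness is computed with a Brandes-style accumulation over ordered pairs $(j, k)$, $j \neq k$, and the skeleton is then built greedily (Kruskal-style, with a union-find structure) from the edges of highest betweenness that do not close a loop. The names `edge_betweenness` and `skeleton` are hypothetical, nodes are assumed to carry comparable labels such as integers, and ties between edges of equal betweenness are broken arbitrarily.

```python
from collections import deque


def edge_betweenness(adj):
    """Edge betweenness of an unweighted, undirected graph, summed over ordered pairs."""
    score = {}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and recording predecessors.
        dist, sigma, preds, order = {s: 0}, {s: 1.0}, {s: []}, []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0.0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Back-propagate path fractions from the farthest nodes towards s.
        delta = {v: 0.0 for v in order}
        for v in reversed(order):
            for u in preds[v]:
                share = sigma[u] / sigma[v] * (1.0 + delta[v])
                edge = (min(u, v), max(u, v))
                score[edge] = score.get(edge, 0.0) + share
                delta[u] += share
    return score


def skeleton(adj):
    """Spanning tree (or forest) that prefers edges with the highest betweenness."""
    load = edge_betweenness(adj)
    parent = {v: v for v in adj}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    tree = []
    for u, v in sorted(load, key=load.get, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge does not close a loop
            parent[ru] = rv
            tree.append((u, v))
    return tree


# Toy example (assumed): a triangle 0-1-2 with a pendant node 3 attached to node 2.
toy = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(edge_betweenness(toy))
print(skeleton(toy))
```

In the toy graph the edge between nodes 0 and 1 carries the lowest load, so it is the shortcut left out of the skeleton.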

As an example, we used the above algorithm to estimate the fractal dimension of the human
protein-protein interaction network as well as that of its skeleton. The result is shown in Fig.
1. When we applied the box covering algorithm on the skeleton, more boxes were needed for
each fixed box radius $r_B$. The rate at which the number $N_B$ of boxes changes varies as the
size of the box increases. More specifically, when $r_B$ is smaller, the number of boxes needed is
not much different for the original network and its skeleton; but when $r_B$ is larger, many
more boxes are needed to cover the skeleton than the original network.

2.2 Algorithms for multifractal analysis of networks 

Most well-known fractals such as the Cantor set, the Koch curve and the Sierpinski triangle
are homogeneous since they consist of a geometrical figure repeated on an ever-reduced scale.
For these objects, the fractal dimension is the same on all scales. However, real-world fractals
may not be homogeneous; there is rarely an identical motif repeated on all scales. Two objects
might have the same fractal dimension and yet look completely different. Real-world fractals
possess rich scaling and self-similarity properties that can change from point to point, and thus
can have different dimensions at different scales. The present paper investigates these properties
on complex networks. In particular, we develop tools to determine whether they are simple fractals
or multifractals, and how different two networks can be even though they have the same fractal
dimension. In other words, we aim to develop an approach for multifractal analysis of complex
networks.

The most common algorithm of multifractal analysis is the fixed-size box-counting
algorithm [18, 22, 25]. For a given probability measure $0 \le \mu \le 1$ with support set $E$ in a
metric space, we consider the partition sum

$$Z_\epsilon(q) = \sum_{\mu(B) \neq 0} [\mu(B)]^q, \qquad (1)$$

where $q$ is a real number and the sum runs over all different non-overlapping boxes $B$ of a given
size $\epsilon$ in a covering of the support $E$. It follows that $Z_\epsilon(q) \ge 0$ and
$Z_\epsilon(1) = 1$. The mass exponent function $\tau(q)$ of the measure $\mu$ is defined by

$$\tau(q) = \lim_{\epsilon \to 0} \frac{\ln Z_\epsilon(q)}{\ln \epsilon}. \qquad (2)$$
Proposition 1 The mass exponent $\tau(q)$ is an increasing function of $q$.

Proof. For $q_1 < q_2$, it follows from $\mu$ being a probability measure that
$[\mu(B)]^{q_1} \ge [\mu(B)]^{q_2}$ for every box $B$; thus $Z_\epsilon(q_1) \ge Z_\epsilon(q_2)$.
Since $\ln \epsilon < 0$ when $\epsilon \to 0$, the increasing property of $\tau(q)$ follows. □
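The generalized fractal dimensions of the measure $\mu$ are defined as

$$D_q = \frac{\tau(q)}{q - 1} \quad \text{for } q \neq 1, \qquad (3)$$

and

$$D_q = \lim_{\epsilon \to 0} \frac{Z_{1,\epsilon}}{\ln \epsilon} \qquad (4)$$

for $q = 1$, where $Z_{1,\epsilon} = \sum_{\mu(B) \neq 0} \mu(B) \ln \mu(B)$.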
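To make these definitions concrete, the sketch below evaluates the partition sum and a single-scale estimate of $D_q$ for two toy measures on four boxes of size $\epsilon = 1/4$. It is only an illustration under stated assumptions: the function names and the two example measures are hypothetical, and no limit $\epsilon \to 0$ is taken, so the numbers are finite-size estimates rather than true dimensions.

```python
import math


def partition_sum(measures, q):
    """Z(q): sum of mu(B)^q over the boxes with nonzero measure."""
    return sum(mu ** q for mu in measures if mu > 0)


def dimension_estimate(measures, q, eps):
    """Single-scale estimate of D_q at box size eps (no limit is taken)."""
    if q == 1:
        z1 = sum(mu * math.log(mu) for mu in measures if mu > 0)
        return z1 / math.log(eps)
    return math.log(partition_sum(measures, q)) / ((q - 1) * math.log(eps))


uniform = [0.25, 0.25, 0.25, 0.25]   # every box carries the same mass
uneven = [0.5, 0.3, 0.15, 0.05]      # mass concentrated in a few boxes

for q in (-2, 0, 1, 3):
    print(q, dimension_estimate(uniform, q, 0.25), dimension_estimate(uneven, q, 0.25))
```

For the uniform measure the estimate is the same for every $q$, whereas for the uneven measure it decreases as $q$ grows, which is the behaviour that separates monofractal from multifractal measures in what follows.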

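Proposition 2 $D_q$ is a decreasing function of $q$ for $q \neq 1$.

Proof. Combining Eqs. (2) and (3) yields, for $q \neq 1$,

$$D_q = \lim_{\epsilon \to 0} \frac{\frac{1}{q-1} \ln Z_\epsilon(q)}{\ln \epsilon}. \qquad (5)$$

We need to consider 3 cases:

(i) For $1 < q_1 \le q_2 < \infty$, we have

$$0 < \frac{1}{q_2 - 1} \le \frac{1}{q_1 - 1} < \infty \qquad (6)$$

and

$$0 < Z_\epsilon(q_2) \le Z_\epsilon(q_1) < 1,$$

that is,

$$\ln Z_\epsilon(q_2) \le \ln Z_\epsilon(q_1) < 0. \qquad (7)$$

From Eqs. (5)-(7), it is seen that $\frac{1}{q-1} \ln Z_\epsilon(q)$ increases as a function of $q$.
Thus $D_q$ decreases as a function of $q$ since $\ln \epsilon < 0$ as $\epsilon \to 0$.

(ii) For $0 < q_1 \le q_2 < 1$, we have

$$-\infty < \frac{1}{q_2 - 1} \le \frac{1}{q_1 - 1} < -1$$

and

$$\frac{1}{q_2 - 1} \ln Z_\epsilon(q_2) \ge \frac{1}{q_1 - 1} \ln Z_\epsilon(q_1).$$

Thus $D_q$ decreases as a function of $q$ in this case.

(iii) For $-\infty < q_1 \le q_2 < 0$, we have

$$-1 < \frac{1}{q_2 - 1} \le \frac{1}{q_1 - 1} < 0$$

and also

$$\frac{1}{q_2 - 1} \ln Z_\epsilon(q_2) \ge \frac{1}{q_1 - 1} \ln Z_\epsilon(q_1).$$

Thus $D_q$ also decreases as a function of $q$ in this case. □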
For every box size $\epsilon$, the number $\alpha = \ln \mu(B) / \ln \epsilon$, also referred to
as the Hölder exponent, is the singularity strength of the box. This exponent may be interpreted as
a crowding index of a measure of concentration: the greater $\alpha$ is, the smaller is the
concentration of the measure, and vice versa. For every box size $\epsilon$, the number of cells
$N_\alpha(\epsilon)$ in which the Hölder exponent $\alpha$ has a value within the range
$[\alpha, \alpha + d\alpha]$ behaves like

$$N_\alpha(\epsilon) \sim \epsilon^{-f(\alpha)}.$$



The function $f(\alpha)$ signifies the Hausdorff dimension of the subset which has singularity
$\alpha$; that is, $f(\alpha)$ characterizes the abundance of cells with Hölder exponent $\alpha$
and is called the singularity spectrum of the measure. The measure $\mu$ is said to be a
multifractal measure if its singularity spectrum $f(\alpha) \neq 0$ for a range of values of
$\alpha$. The singularity spectrum $f(\alpha)$ and the mass exponent function $\tau(q)$ are
connected via the Legendre transform [9]:

$$\alpha(q) = \frac{d\tau(q)}{dq} \qquad (8)$$

and

$$f(\alpha(q)) = q\,\alpha(q) - \tau(q), \qquad q \in \mathbb{R}.$$
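As a worked illustration (a standard textbook example, not taken from the networks studied here), consider the binomial measure on $[0, 1]$ obtained by repeatedly splitting the mass of each dyadic interval in proportions $p$ and $1 - p$, with $0 < p < 1/2$ assumed. With boxes of size $\epsilon = 2^{-n}$ the quantities defined above can be written in closed form:

```latex
\begin{align*}
Z_\epsilon(q) &= \left(p^q + (1-p)^q\right)^n, \qquad \epsilon = 2^{-n}, \\
\tau(q) &= \frac{\ln Z_\epsilon(q)}{\ln \epsilon} = -\log_2\left(p^q + (1-p)^q\right), \\
\alpha(q) &= \frac{d\tau(q)}{dq}
           = -\frac{p^q \log_2 p + (1-p)^q \log_2 (1-p)}{p^q + (1-p)^q}, \\
f(\alpha(q)) &= q\,\alpha(q) - \tau(q).
\end{align*}
```

Here $\tau(0) = -1$, so $D_0 = \tau(0)/(0 - 1) = 1$ and $f(\alpha(0)) = 1$, the dimension of the support. As $q \to +\infty$ the term $(1-p)^q$ dominates and $\alpha(q) \to -\log_2(1-p)$, while as $q \to -\infty$ the term $p^q$ dominates and $\alpha(q) \to -\log_2 p$, so $\alpha$ ranges over an interval of nonzero length whenever $p \neq 1/2$.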

Considering the relationship between the mass exponent function $\tau(q)$ and the generalized
dimension function $D_q$, the singularity spectrum $f(\alpha)$ contains exactly the same
information as $\tau(q)$ and $D_q$.

Lau and Ngai [35] showed in their Proposition 3.4 (page 57) that

(i) $\lim_{q \to +\infty} D_q = \alpha_{\min}$;

(ii) $\lim_{q \to -\infty} D_q = \alpha_{\max}$.

This result together with Proposition 2 and the definition of a multifractal measure given
above leads to a method to determine the multifractality of a probability measure $\mu$:

When $\alpha_{\min} = \alpha_{\max}$, the function $D_q$ is constant for $q \neq 1$ and the measure
$\mu$ is monofractal.

When $\alpha_{\min} \neq \alpha_{\max}$, $D_q$ is a decreasing function of $q \neq 1$ and the
measure $\mu$ is multifractal.

This method is the key element in the next section when we investigate the multifractality of a
variety of networks.

The generalized fractal dimensions are estimated through a linear regression of
$[\ln Z_\epsilon(q)]/(q - 1)$ against $\ln \epsilon$ for $q \neq 1$, and similarly through a linear
regression of $Z_{1,\epsilon}$ against $\ln \epsilon$ for $q = 1$. The value $D_1$ is called the
information dimension and $D_2$ the correlation dimension, while $D_0$ is equal to the Hausdorff
dimension.

For a network, the measure of each box is defined as the ratio of the number of nodes covered
by the box to the total number of nodes in the network. The fixed-size box-counting algorithm
of Kim et al. [13] described above cannot be used directly to analyze the multifractal behavior of
networks, because it selects the center of each box at random, which affects the number of boxes
obtained for a fixed size. In particular, if a node with large degree (a hub) is chosen first,
many nodes are covered at once, which yields an efficient box covering. However, if a node with
small degree is chosen first, few nodes are covered. As a result, the partition sum defined by
Eq. (1) changes each time we proceed with box counting. We illustrate this situation in Fig.
2, where we consider a network of eight nodes. In Fig. 2A, for a fixed box size $r_B = 1$, node
a is first chosen as the center of a box, and both nodes a and b are covered in the same box
colored in black. Next, node f is chosen as the center of a box, and nodes b, c, d, e, g are all
within a distance $r_B = 1$. Since node b has already been covered in the previous step, nodes c,
d, e, g, f are covered in the same box colored in blue. In the last step, node g is chosen as the
center of a box and its neighboring node h is the only one found within a distance $r_B = 1$ that
is not covered yet, so h is the only node covered in a box colored in red. In summary, three boxes
are needed to cover the entire network. In Fig. 2B, for the same fixed box size $r_B = 1$, node
h is first chosen as the center of a box, and both nodes h and g are covered in the same box
colored in red. Next, node f is chosen as the center of a box, and nodes e, g are all within a
distance $r_B = 1$. Since node g has already been covered in the previous step, nodes e, f are
covered in the same box colored in blue. Next, node d is chosen as the center of a box and, since
its two neighbors f, g have already been covered, d is the only node covered in a box colored
in brown; likewise, node c is chosen and covered alone in the box colored in green. In the last
step, node a is chosen as a center and both nodes a and b are covered within one box colored
in black. In summary, five boxes are needed to cover the entire network. In these two cases of
Figs. 2A and 2B, the partition sums are different. To avoid this effect, we propose to take the
average of the partition sums over a large number of random coverings and accordingly modify the
original fixed-size box-counting algorithm into a new method. To our knowledge, this is the first
time such an improvement has been introduced for analyzing the multifractal behavior of complex
networks.
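The size of the effect can be checked by hand from the box contents listed above. With $N = 8$ nodes, the covering of Fig. 2A has boxes of 2, 5 and 1 nodes, and the covering of Fig. 2B has boxes of 2, 2, 1, 1 and 2 nodes. Taking $q = 2$ as an illustrative choice (the labels $Z^{(A)}$ and $Z^{(B)}$ are introduced here only for this comparison), Eq. (1) gives:

```latex
\begin{align*}
Z^{(A)}(2) &= \left(\tfrac{2}{8}\right)^2 + \left(\tfrac{5}{8}\right)^2 + \left(\tfrac{1}{8}\right)^2
            = \tfrac{30}{64} \approx 0.469, \\
Z^{(B)}(2) &= 3\left(\tfrac{2}{8}\right)^2 + 2\left(\tfrac{1}{8}\right)^2
            = \tfrac{14}{64} \approx 0.219.
\end{align*}
```

The two coverings of the same network with the same box size therefore give partition sums that differ by more than a factor of two, and for $q = 0$ the sums are simply the box counts, 3 and 5.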

We need to calculate the shortest-path distance matrix for each network and these matrices
are the input data for fractal and multifractal analyses. We describe the procedure as follows:

(i) Transform the pairs of edges and nodes in a network into a matrix $A_{N \times N}$, where $N$
is the number of nodes of the network. The matrix $A_{N \times N}$ is a symmetric matrix whose
elements are $a_{ij} = 0$ or 1, with $a_{ij} = 1$ when there is an edge between node $i$ and node
$j$, and $a_{ij} = 0$ when there is no edge between them. We define that each node has no edge
with itself and accordingly $a_{ii} = 0$.

Remark: $A_{N \times N}$ could be the input data for calculating the degree distribution and
characteristic path length to determine whether the network possesses the properties of
scale-free degree distribution and small-world effect.

(ii) Compute the shortest path length between all connected pairs of nodes and save these lengths
in another matrix $B_{N \times N}$.

Remark: In graph theory, calculation of the shortest path is a significant problem and
there are many algorithms for solving this problem. Here, in our approach, we use
Dijkstra's algorithm [36] of the Matlab toolbox.
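The two preprocessing steps can be sketched as follows. This is not the Matlab code used in the paper: it is a minimal Python illustration in which breadth-first search replaces Dijkstra's algorithm (the two give the same distances when every edge has length one), nodes are assumed to be labelled $0, \ldots, N-1$, and the function names are hypothetical.

```python
from collections import deque


def adjacency_matrix(n, edges):
    """Symmetric 0/1 matrix A with a_ii = 0, built from a list of node pairs."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        if i != j:  # a node has no edge with itself
            a[i][j] = a[j][i] = 1
    return a


def shortest_path_matrix(a):
    """Matrix B of shortest-path lengths; -1 marks pairs that are not connected."""
    n = len(a)
    b = [[-1] * n for _ in range(n)]
    for s in range(n):
        b[s][s] = 0
        queue = deque([s])
        while queue:  # breadth-first search from s
            u = queue.popleft()
            for v in range(n):
                if a[u][v] and b[s][v] < 0:
                    b[s][v] = b[s][u] + 1
                    queue.append(v)
    return b


# Toy example (assumed): a path 0-1-2-3.
A = adjacency_matrix(4, [(0, 1), (1, 2), (2, 3)])
B = shortest_path_matrix(A)
diameter = max(max(row) for row in B)
print(B, diameter)
```

The largest entry of `B` is the network diameter, which is used below to bound the box sizes.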




After the above steps we can use the matrix $B_{N \times N}$ as input data for multifractal
analysis based on our modified fixed-size box counting algorithm as follows:

(i) Initially, all the nodes in the network are marked as uncovered and no node has been
chosen as a seed or center of a box.

(ii) According to the number of nodes in the network, set $t = 1, 2, \ldots, T$ appropriately.
Group the nodes into $T$ different ordered random sequences. More specifically, in each sequence,
nodes which will be chosen as seed or center of a box are randomly arrayed.
Remark: $T$ is the number of random sequences and is also the value over which we take
the average of the partition sum $Z_r(q)$. Here in our study, we set $T = 200$ for all the
networks in order to compare.

(iii) Set the size of the box in the range $r \in [1, d]$, where $d$ is the diameter of the network.
Remark: When $r = 1$, the nodes covered within the same box must be connected to each
other directly. When $r = d$, the entire network could be covered in only one box no matter
which node was chosen as the center of the box.

(iv) For each center of a box, search all the neighbors within distance $r$ and cover all nodes
which are found but have not been covered yet.

(v) If no newly covered nodes have been found, then this box is discarded.

(vi) For the nonempty boxes $B$, we define their measure as $\mu(B) = N_B/N$, where $N_B$ is the
number of nodes covered by the box $B$, and $N$ is the number of nodes of the entire network.

(vii) Repeat (iv) until all nodes are assigned to their respective boxes.

(viii) When the process of box counting is finished, we calculate the partition sum as
$Z_r(q) = \sum_{\mu(B) \neq 0} [\mu(B)]^q$ for each value of $r$.

(ix) Repeat (iii) and (iv) for all the random sequences, and take the average of the partition
sums $\bar{Z}_r(q) = \left(\sum_{t=1}^{T} Z_r(q)\right)/T$, and then use $\bar{Z}_r(q)$ for linear
regression.
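The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `dist` is assumed to be the shortest-path matrix as a list of lists (with -1 for unreachable pairs), the function names `cover` and `averaged_partition_sum` are hypothetical, and re-creating the random generator from a fixed `seed` is simply one way of reusing the same $T$ sequences for every box size $r$.

```python
import random


def cover(dist, order, r):
    """Box measures mu(B) = N_B / N for one ordered sequence of candidate centers."""
    n = len(dist)
    covered = [False] * n
    masses = []
    for c in order:
        # steps (iv)-(v): collect uncovered nodes within distance r of the center
        new = [v for v in range(n) if not covered[v] and 0 <= dist[c][v] <= r]
        if new:  # an empty box is discarded
            for v in new:
                covered[v] = True
            masses.append(len(new) / n)  # step (vi)
    return masses


def averaged_partition_sum(dist, r, q, T=200, seed=0):
    """Average of Z_r(q) over T random orderings of the candidate centers."""
    rng = random.Random(seed)  # same seed -> same T sequences for every r
    n = len(dist)
    total = 0.0
    for _ in range(T):
        order = list(range(n))
        rng.shuffle(order)  # step (ii): one random sequence
        total += sum(mu ** q for mu in cover(dist, order, r))  # step (viii)
    return total / T  # step (ix)


# Toy example (assumed): a path of four nodes, where dist[i][j] = |i - j|.
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
for r in (1, 2, 3):
    print(r, averaged_partition_sum(dist, r, q=2, T=50))
```

For $r$ equal to the diameter every ordering yields a single box of measure one, so the averaged partition sum is exactly 1 for any $q$.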

Linear regression is an essential step to get the appropriate range of
$r \in [r_{\min}, r_{\max}]$ and to get the generalized fractal dimensions $D_q$. In our approach,
we run the linear regression of $[\ln Z_r(q)]/(q - 1)$ against $\ln(r/d)$ for $q \neq 1$, and
similarly the linear regression of $Z_{1,r}$ against $\ln(r/d)$ for $q = 1$, where
$Z_{1,r} = \sum_{\mu(B) \neq 0} \mu(B) \ln \mu(B)$ and $d$ is the diameter of the network. An
example of linear regression for the Arabidopsis thaliana PPI network is shown in Fig. 3. The
numerical results show that the best fit occurs in the range $r \in (1, 9)$, hence we select this
range to perform multifractal analysis and get the spectrum of generalized dimensions $D_q$.
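The regression step amounts to an ordinary least-squares slope. The sketch below is an illustration only: the function names are hypothetical, the input `z_by_r` is assumed to map each box size $r$ in the chosen fitting range to the averaged partition sum (or to $Z_{1,r}$ when $q = 1$), and the synthetic data in the usage example are constructed to follow an exact power law rather than taken from any network.

```python
import math


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def estimate_dq(z_by_r, q, diameter):
    """Estimate D_q from {r: Z_r(q)} for q != 1, or from {r: Z_{1,r}} for q == 1."""
    xs = [math.log(r / diameter) for r in z_by_r]
    if q == 1:
        ys = list(z_by_r.values())
    else:
        ys = [math.log(z) / (q - 1) for z in z_by_r.values()]
    return slope(xs, ys)


# Synthetic check: Z_r(2) = (r/d)^(1.5) corresponds to D_2 = 1.5.
d = 9
synthetic = {r: (r / d) ** 1.5 for r in range(1, 10)}
print(estimate_dq(synthetic, 2, d))
```

In practice the fitting range would be restricted to the interval of $r$ where the points are close to a straight line, as described above.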

After this spectrum has been obtained, we use ΔD(q) = max D(q) − lim D(q) to verify how
Dq changes along each curve. The quantity ΔD(q) has been used in the literature to describe
the density of an object. In this paper, based on our modified fixed-size box covering method,
ΔD(q) can help to understand how the edge density changes in the complex network. In other
words, a larger value of ΔD(q) means the edge distribution is more uneven. More specifically,
for a network, edge distribution could vary from an area of hubs where edges are dense to an
area where nodes are just connected with a few links.
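A small sketch of this quantity is given below. It rests on two assumptions that are not stated in the text: the limit of D(q) is approximated by the value at the largest computed q, and the tolerance `tol` used to decide whether a curve is flat is a hypothetical parameter standing in for the error range of the regression. The curves in the usage example are made-up numbers.

```python
def delta_d(dq_curve):
    """dq_curve: list of (q, D_q) pairs. Returns max D_q minus D_q at the largest q."""
    ordered = sorted(dq_curve)          # sort by q
    d_values = [d for _, d in ordered]
    return max(d_values) - d_values[-1]


def looks_multifractal(dq_curve, tol):
    """Crude check: the curve is treated as non-constant if delta_d exceeds tol."""
    return delta_d(dq_curve) > tol


flat = [(q, 1.0) for q in range(0, 11)]
falling = [(0, 2.4), (2, 2.1), (5, 1.8), (10, 1.5)]
print(delta_d(flat), delta_d(falling))
```

A flat curve gives a value of zero, while a curve that falls after its peak gives a positive value whose size reflects how uneven the measure is.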

In the following sections, we calculate the generalized fractal dimensions Dq. From the shape
of Dq, we determine the multifractality of the network using the method described above. We
then calculate ΔD(q) to verify how Dq changes along each curve.

3 Results and discussions 

In recent years, with the development of technology, research on networks has shifted
away from the analysis of single small graphs and the properties of individual vertices or edges
within such graphs to the consideration of large-scale statistical properties of complex networks.
Newman [37] reviewed some recent works on the structure and function of networked systems
such as the Internet, the World Wide Web, social networks and a variety of biological networks.
Besides reviewing empirical studies, the author also focused on a number of statistical properties
of networks including path lengths, degree distributions, clustering and resilience. In this paper,
we pay attention to another aspect of networks, namely their multifractality. We aim to develop
a tool based on this property to characterize and classify real-world networks.

It has been shown that many real complex networks share distinctive characteristics that
differ in many ways from random and regular networks [2, 3, 37]. Fundamental properties of
complex networks such as the small-world effect and the scale-free degree distribution have
attracted much attention recently. These properties have in fact been found in many naturally
occurring networks. In Subsections 3.1, 3.2 and 3.3, we generate scale-free networks using the BA
model of Barabási and Albert [38], small-world networks using the NW model of Newman and Watts
[39], and random networks using the ER model of Erdős and Rényi [3], respectively. We then
apply our modified fixed-size box counting algorithm to analyze the multifractal behavior of
these networks.



3.1 Scale-free networks

We use the elegant and simple BA model of Barabási and Albert [38] to generate scale-free
networks. The origin of the scale-free behavior in many systems can be traced back to this BA
model, which correctly predicts the emergence of the scaling exponent. The BA model consists of
two mechanisms. Initially, the network begins with a seed network of n nodes, where n > 2
and the degree of each node in the initial network should be at least 1, otherwise it will always
remain disconnected from the rest of the network. For example, here we start with an initial
network of 5 nodes. Its interaction matrix is

/ 1 1 \ 
10 10 
1 
110 
\ 1 / 

We then add one node to this initial network at a time. Each new node is connected to n
existing nodes with a probability that is proportional to the number of links that the existing
nodes already have. Formally, the probability $p_i$ that the new node is connected to node $i$ is

$$p_i = \frac{k_i}{\sum_j k_j}, \qquad (9)$$

where $k_i$ is the degree of node $i$. So hubs tend to quickly accumulate even more links, while
nodes with only a few links are unlikely to be chosen as destination for a new link.

In this paper, these scale-free networks are generated based on the same seed, which is the
initial network of 5 nodes. For better comparison, in each step, one node will be added into the
network with one link. Then we apply the modified fixed-size box counting method on them to
detect their multifractal behavior.
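The growth process can be sketched as follows. This is an illustrative sketch, not the generator used for the reported results: the function name `barabasi_albert` is hypothetical, the seed network in the usage example is an assumed 5-node path (not the interaction matrix of the paper), and nodes are assumed to be labelled $0, 1, 2, \ldots$ with every seed node having degree at least one. Preferential attachment is implemented by drawing uniformly from a list in which each node appears once per incident edge, so that the draw is proportional to degree.

```python
import random


def barabasi_albert(seed_edges, n_total, m=1, rng=None):
    """Grow a network to n_total nodes, attaching each new node to m existing nodes."""
    rng = rng or random.Random()
    edges = list(seed_edges)
    # Each node appears in `targets` once per incident edge, so a uniform
    # draw from this list picks node i with probability k_i / sum_j k_j.
    targets = [v for e in edges for v in e]
    n_seed = max(targets) + 1
    for new in range(n_seed, n_total):
        chosen = set()
        while len(chosen) < m:  # m distinct existing nodes
            chosen.add(rng.choice(targets))
        for old in chosen:
            edges.append((old, new))
            targets.extend((old, new))
    return edges


# Usage: an assumed 5-node path as seed, one link per new node.
seed = [(0, 1), (1, 2), (2, 3), (3, 4)]
edges = barabasi_albert(seed, 1000, m=1, rng=random.Random(42))
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print(len(edges), max(degree.values()))
```

With one link per new node the result is a tree, so a network of $N$ nodes grown from a connected tree seed has exactly $N - 1$ edges.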

In Fig. 4, we can see that scale-free networks are multifractal by the shape of the Dq curves.
The Dq functions of these networks decrease sharply after the peak. An explanation is that, in
a scale-free network, there are several nodes which are known as hubs that have a large number
of edges connected to them, so the edge density around the areas near the hubs is larger than
in the remaining parts of the network.

We summarize the numerical results in Table 1 including the number of nodes, number of
edges, diameter, power law exponent $\gamma$, maximum value of Dq, limit of Dq, and ΔDq. From
these results we can see that scale-free networks with larger size (more nodes and more edges)
are likely to have larger values of the maximum and limit of Dq. In other words, the function
Dq increases with the size of a scale-free network. An explanation for this situation is that
larger scale-free networks usually have more hubs which make the structure of the network more
complex.

Scale-free networks show a power-law degree distribution $P(k) \sim k^{-\gamma}$, where $P(k)$ is
the probability that a randomly chosen node has degree $k$. It was shown in [6, 7] that when
$\gamma < 2$, the average degree diverges, while for $\gamma < 3$, the standard deviation of the
degree diverges. It has been found that the degree exponent $\gamma$ usually varies in the range
$2 < \gamma < 3$ [6] for most scale-free networks. Accordingly, we computed the power-law exponent
of these generated scale-free networks. The results show that there does not seem to be any clear
relationship between the power-law exponent and the maximum of Dq, limit of Dq or ΔDq.

3.2 Small-world networks

In 1998, Watts and Strogatz [40] proposed a single-parameter small-world network model
that bridges the gap between a regular network and a random graph. With the WS small-world
model, one can link a regular lattice with a pure random network by a semirandom network with
high clustering coefficient and short average path length. Later on, Newman and Watts [39]
modified the original WS model. In the NW model, instead of rewiring links between nodes,
extra links called shortcuts are added between pairs of nodes chosen at random, but no links are
removed from the existing network. The NW model is equivalent to the WS model for small p
and sufficiently large N, but is easier to work with.

In this paper, we use the NW model as follows. Firstly, we select three parameters:
the dimension $n$, which is the number of nodes in a graph; the mean degree $k$ (assumed to be
an even integer), which is the number of nearest neighbors to connect; and the probability $p$ of
adding a shortcut in a given row, where $0 \le p \le 1$ and $k \gg \ln(n) \gg 1$. Secondly, we
follow two steps:

(i) Construct a regular ring lattice. For example, if the nodes are named $N_0, \ldots, N_{n-1}$,
there is an edge $S_{ij}$ between nodes $N_i$ and $N_j$ if and only if $|i - j| = K$ for
$K \in [0, k/2]$;

(ii) Add a new edge between nodes $N_i$ and $N_j$ with probability $p$.
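The two steps can be sketched as below. The sketch relies on assumptions that go beyond the text: the lattice wraps around the ring (distances are taken modulo $n$), self-loops are excluded, and step (ii) is read as "each node receives, with probability $p$, one shortcut to a uniformly chosen other node", which is only one possible interpretation of a shortcut "in a given row". The function names are hypothetical.

```python
import random


def ring_lattice(n, k):
    """Regular ring: each node is linked to its k/2 nearest neighbours on each side."""
    edges = set()
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n  # wrap around the ring
            edges.add((min(i, j), max(i, j)))
    return edges


def newman_watts(n, k, p, rng):
    """Ring lattice plus random shortcuts; no lattice edge is ever removed."""
    edges = ring_lattice(n, k)
    for i in range(n):
        if rng.random() < p:  # node i receives a shortcut with probability p
            j = rng.randrange(n)
            if j != i:
                edges.add((min(i, j), max(i, j)))  # duplicates are ignored by the set
    return edges


# Usage: 20 nodes with two neighbours on each side, as in the illustration of Fig. 5.
lattice = ring_lattice(20, 4)
network = newman_watts(20, 4, 0.5, random.Random(7))
print(len(lattice), len(network))
```

With $n = 5000$ and $k = 100$ (50 neighbours on each side) the lattice has $5000 \times 50 = 250{,}000$ edges, matching the size of the regular network described below.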

An illustration of this generating process is given in Fig. 5. The upper left figure corresponds
to the probability p = 0. It is a regular network containing 20 nodes and each node has two
neighbors on each side. In other words, in this regular network, each node has four edges. All
the nodes and edges are shown in blue. Then we start generating small-world networks based
on this regular network. The upper right figure of Fig. 5 corresponds to the probability p = 0.1;
one edge, colored in black, is added into the network. The network then becomes a
small-world network. The bottom left figure corresponds to the probability p = 0.5; seven black
edges are added into the original regular network and it is also a small-world network. The
bottom right figure corresponds to the probability p = 1; 10 black edges are added into the
original regular network and this time it becomes a random network.

In this paper, we first generated a regular network which contains 5000 nodes and 250,000
edges; each node has 50 edges on each side. We then applied the modified fixed-size box counting
method to this regular network. The numerical results are shown in the last row of Table 2. Both
the maximum value of Dq and the limit of Dq are equal to one, thus ΔDq = 0. This is because
regular networks are not fractal, and they have dimension one. Secondly, for better comparison,
we generated ten small-world networks based on a regular network of 5000 nodes with 5 edges
on each side of a node. During the generation, when the probability p increases, more edges are
added into the original regular network. Then we apply the modified fixed-size box counting
method on them to detect their multifractal behavior. We summarize the numerical results in
Table 2, which includes the number of nodes, number of edges, diameter, probability p (the
generating parameter), maximum value of Dq and ΔDq. These results indicate that, when p
increases, more edges are added and accordingly both the maximum and limit values of Dq
increase.

In Fig. 6 we can see that the Dq curve of the regular network, for which the probability p = 0
during generation, is a straight line with the value 1. The Dq curves of the other small-world
networks are also approximately straight lines but with different Dq values. So these networks are
not multifractal. Another interesting property is apparent when 0.03 < p < 0.2, in which case Dq
increases along with the value of p. More specifically, when p increases, more edges are added
to the network, and both the maximum and limit values of Dq increase. The
values of ΔDq are all within the error range, confirming that the Dq curves are straight lines.

3.3 Random networks

The Erdős-Rényi random graph model is the oldest and one of the most studied techniques
to generate complex networks.

We generate random networks based on the ER model [1]: 

(i) Start with isolated nodes; 

(ii) Pick up every pair of nodes and connect them by an edge with probability p. 

Usually, the result of this generation is a set of separate subnetworks. In this work, we
consider only the largest connected part as the network to work on and apply the modified
fixed-size box counting method to detect its multifractal behavior. We then summarize the
numerical results in Table 3 including the number of nodes, number of edges, diameter, probability
p (the generating parameter), maximum value of Dq, limit of Dq, and ΔDq. These results indicate
that there is no clear relationship between Dq and the size of the random network.

In Fig. 7, we can see that the Dq curves of random networks decrease slowly after the peak,
and the size of the change is given by the values of ΔDq. This pattern occurs because, during
the generating process, nodes are randomly connected with probability p, so few hubs may exist.
In contrast with scale-free networks, this slow decrease supports the claim that, in random
networks, edges are distributed more symmetrically.

Remark: In the present study, we use the shape of the generalized fractal dimensions Dq to
determine whether an object is multifractal. For a monofractal system, which has the same
scaling behavior at any point, Dq should be a constant independent of q, while for a
multifractal, Dq should be a non-increasing nonlinear curve as q increases. However, in our
results, an anomalous behavior is observed: the Dq curves increase at the beginning when
q < 0. This anomalous behavior has also been observed by Bos et al. [41], Smith and Lange
[42], and Fernandez et al. [33]. Some reasons for this behavior have been suggested, including
that the boxes contain few elements [l3], or that the small scaling regime covers less than a
decade so that we cannot extrapolate the box counting results for the partition function to
zero box size [41]. On encountering the anomalous spectra of Dq, we tried another method of
multifractal analysis called the sand-box method, but the linear regression fittings were not
satisfactory. We therefore used the modified fixed-size box counting algorithm in this
research. For the purpose of detecting the multifractality of complex networks, we accept the
anomalous spectra of Dq and focus on the decreasing parts, which are presented in Figs. 4 to 8.
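
The criterion in the remark (constant Dq for a monofractal, non-increasing Dq for a multifractal) can be checked on a textbook measure. The sketch below does not use a network or the box-covering algorithm of this paper. As an assumption for illustration, it takes a binomial multiplicative cascade on the unit interval, whose box measures at box size 2^(-level) are known exactly, and estimates Dq as the regression slope of ln Z_q / (q - 1) against the logarithm of the box size, with Z_q the sum of the q-th powers of the box measures (for q = 1 the entropy sum is used instead). All function names are hypothetical.

```python
import math

def cascade_measures(m, level):
    """Box measures of a binomial cascade on [0, 1] at box size 2**-level."""
    mu = [1.0]
    for _ in range(level):
        # Each box splits in two, receiving fractions m and 1 - m of its mass.
        mu = [w * f for w in mu for f in (m, 1.0 - m)]
    return mu

def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def generalized_dimension(m, q, levels=range(2, 11)):
    """Estimate D_q from the scaling of the partition function with box size."""
    xs, ys = [], []
    for level in levels:
        mu = cascade_measures(m, level)
        xs.append(math.log(2.0 ** -level))
        if q == 1:
            # Information dimension: limit of the general formula as q -> 1.
            ys.append(sum(w * math.log(w) for w in mu))
        else:
            ys.append(math.log(sum(w ** q for w in mu)) / (q - 1))
    return slope(xs, ys)

for q in (-2, 0, 1, 2, 5):
    print(q, round(generalized_dimension(0.5, q), 3),
          round(generalized_dimension(0.3, q), 3))
```

For m = 0.5 the mass is spread uniformly and the estimate stays at 1 for every q, while for m = 0.3 the estimate decreases as q increases, which is the multifractal signature the remark refers to.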

3.4 Protein-protein interaction networks 

Our fractal and multifractal analyses are based on connected networks which do not have
separated parts or isolated nodes. In order to apply them to protein-protein interaction (PPI)
networks, some preparation is needed in advance. First, we need to find the largest connected
part of each data set. Many tools and methods could be used for this purpose. In our study,
we adopt Cytoscape [H], an open bioinformatics software platform for visualizing
molecular interaction networks and analyzing network graphs of any kind involving nodes and
edges. Using Cytoscape, we obtain the largest connected part of each PPI data set, and this
connected part is the network on which the fractal and multifractal analyses are performed.
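
For readers who prefer a scripted alternative to Cytoscape, the same preprocessing step can be sketched with a breadth-first search over an edge list. This is an illustrative stand-in under stated assumptions, not the procedure used in the paper and not Cytoscape's API; the function name and the toy protein labels A to F are hypothetical.

```python
from collections import defaultdict, deque

def largest_connected_component(edges):
    """Return (nodes, edges) of the largest connected part of an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # self-interactions do not connect anything
            adj[u].add(v)
            adj[v].add(u)
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from start.
        comp, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in comp:
                    comp.add(nb)
                    queue.append(nb)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    kept = [(u, v) for u, v in edges if u != v and u in best and v in best]
    return best, kept

# Toy interaction list: a triangle, a separate pair, and a self-interaction.
toy = [("A", "B"), ("B", "C"), ("C", "A"), ("D", "E"), ("F", "F")]
nodes, kept = largest_connected_component(toy)
print(sorted(nodes), kept)
```

Proteins that only interact with themselves end up isolated and are dropped, which matches the requirement that the analyzed network has no separated parts or isolated nodes.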

The protein-protein interaction data used here are mainly downloaded from two databases.
The PPI networks of Drosophila melanogaster (fruit fly), C. elegans, Arabidopsis thaliana and
Schizosaccharomyces pombe are downloaded from BioGRID [45]. The PPI networks of S. cerevisiae
(baker's yeast), E. coli and H. pylori are downloaded from DIP [l6]. We also use the same
human PPI network data as in [47].

We calculated the Dq spectra for eight PPI networks of different organisms, as shown in
Fig. 8. From these Dq curves, we see that all PPI networks are multifractal and that there are
two clear groupings of organisms based on the peak values of their Dq curves. The first group
includes human, Drosophila melanogaster, S. cerevisiae, and C. elegans. The second group
includes just the two bacteria, E. coli and H. pylori. We also see that the Dq curves of the
eight organisms have a similar shape. They all increase when q ∈ [0, 1], reach their peak
values around q = 2, then decrease sharply for q > 2, and finally reach their limit value
when q > 10. So we can take lim D(q) = D(20) and use ΔD(q) = max D(q) − lim D(q) to measure
how the Dq function changes along each curve. We summarize the corresponding numerical results
in Table 4.
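
As a small worked check of the definition ΔD(q) = max D(q) − lim D(q), the sketch below recomputes ΔD(q) from the Max(Dq) and Lim(Dq) columns of Table 4 and ranks the networks by it. The values are copied from Table 4; the variable names are only for this illustration.

```python
# (Max(Dq), Lim(Dq)) pairs copied from Table 4.
table4 = {
    "Human": (4.89, 2.65),
    "Drosophila melanogaster": (4.84, 2.87),
    "S. cerevisiae": (4.62, 2.48),
    "E. coli": (4.15, 2.10),
    "H. pylori": (3.47, 1.91),
    "C. elegans": (4.47, 1.49),
    "Arabidopsis thaliana": (2.51, 1.62),
}

# Delta D(q) = max D(q) - lim D(q), rounded to the precision of the table.
delta = {name: round(d_max - d_lim, 2) for name, (d_max, d_lim) in table4.items()}

# Rank networks from the most uneven curve to the flattest one.
ranked = sorted(delta, key=delta.get, reverse=True)
for name in ranked:
    print(f"{name:25s} {delta[name]:.2f}")
```

The recomputed differences agree with the ΔDq column of Table 4, and the ranking gives one simple way to order the organisms by how strongly their Dq curves vary.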

4 Conclusions 

After analyzing a variety of real complex networks, Song et al. [1] found that they consist of
self-repeating patterns on all length scales, i.e., complex networks have self-similar structures.
They found that the box-counting method is a proper tool to unfold the self-similar properties
of complex networks and to further investigate network properties.

However, describing objects by a single fractal dimension is a limitation of fractal analysis,
especially when the networks exhibit multifractal behavior. Multifractal analysis is a useful
way to characterize the spatial heterogeneity of both theoretical and experimental fractal
patterns. It allows the computation of a set of fractal dimensions, especially the generalized
fractal dimensions Dq.

A modified algorithm for analyzing the multifractal behavior of complex networks is proposed
in this paper. This algorithm is applied to generated scale-free networks, small-world networks
and random networks, as well as protein-protein interaction networks. The numerical results
indicate that multifractality exists in scale-free networks and PPI networks, while for
small-world networks and random networks multifractality is not clear-cut, particularly for
small-world networks generated by the NW model. Furthermore, for scale-free networks, the
values of Dq increase when the size of the network increases because larger scale-free networks
usually have more hubs, which make the structure of the network more complex. However, for
random networks there is no clear relationship between Dq and the size of the network. The
quantity ΔD(q) = max D(q) − lim D(q) has been used to investigate how Dq changes. A larger
ΔD(q) means that the network's edge distribution is more uneven, while a smaller ΔD(q) means
that the edge distribution is more symmetrical, which is the case for random networks.

These results indicate that the algorithm proposed in this paper is a suitable and effective
tool for multifractal analysis of complex networks. In particular, in conjunction with the
quantities derived from Dq, the method and algorithm provide a needed tool to cluster and
classify real networks such as the protein-protein interaction networks of organisms.

Table 1: Comparison of different scale-free networks

Number of nodes   Number of edges   Diameter   γ             Max(Dq)   Lim(Dq)   ΔDq
500               499               13         1.94 ± 0.02   2.67      1.36      1.31
1000              999               16         2.02 ± 0.07   2.93      1.47      1.46
1500              1499              17         2.09 ± 0.04   2.96      1.65      1.30
2000              1999              20         1.99 ± 0.08   3.05      1.76      1.29
3000              2999              20         2.06 ± 0.04   3.26      1.83      1.44
4000              3999              23         2.09 ± 0.03   3.32      1.80      1.52
5000              4999              23         2.08 ± 0.04   3.26      1.75      1.51
6000              5999              22         2.06 ± 0.04   3.39      1.88      1.51
7000              5999              28         2.08 ± 0.04   3.39      2.10      1.29
8000              5999              25         1.91 ± 0.12   3.33      2.11      1.22

Table 2: Comparison of different small-world networks and regular networks with 5000 nodes

Number of nodes          Number of edges   Diameter   p      Max(Dq)   Lim(Dq)   ΔDq
5000                     25159             33         0.03   2.31      2.28      0.03
5000                     25207             29         0.04   2.43      2.37      0.06
5000                     25290             23         0.06   2.56      2.53      0.03
5000                     25358             23         0.08   2.66      2.63      0.03
5000                     25513             18         0.1    2.81      2.75      0.06
5000                     25621             15         0.13   2.89      2.83      0.06
5000                     25792             15         0.15   2.99      2.93      0.06
5000                     26017             12         0.2    3.08      3.04      0.04
5000 (regular network)   250000            50         -      1         1         0.00



Table 3: Comparison of different random networks

Number of nodes   Number of edges   Diameter   p         Max(Dq)   Lim(Dq)   ΔDq
449               610               15         0.005     2.42      2.14      0.28
994               2502              8          0.005     3.32      2.87      0.45
1991              5939              9          0.003     3.73      3.41      0.32
2484              6310              11         0.002     3.70      3.33      0.37
2790              4374              18         0.001     3.29      2.95      0.34
3373              5978              15         0.001     3.47      3.15      0.32
3931              8125              13         0.001     3.67      3.35      0.31
4919              10179             13         0.0008    3.78      3.39      0.39
5620              8804              16         0.00058   3.54      3.21      0.33

Table 4: Comparison of different PPI networks

Networks                  Number of nodes   Number of edges   Diameter   D0     Max(Dq)   Lim(Dq)   ΔDq
Human                     8934              41341             14         2.34   4.89      2.65      2.24
Drosophila melanogaster   7476              26534             11         2.34   4.84      2.87      1.97
S. cerevisiae             4976              21875             10         2.36   4.62      2.48      2.14
E. coli                   2516              11465             12         2.14   4.15      2.10      2.05
H. pylori                 686               1351              9          2.27   3.47      1.91      1.56
C. elegans                3343              6437              13         2.28   4.47      1.49      2.98
Arabidopsis thaliana      1298              2767              25         1.83   2.51      1.62      0.89




Figure 1: Fractal scaling of the human PPI network (o) and its skeleton (.). The fractal
dimension is the absolute value of the slope of each linear fit, which is 2.20 ± 0.09 for the
original network and 2.07 ± 0.09 for its skeleton.

Figure 2: The traditional box-counting algorithm may result in different numbers of boxes
needed to cover the entire network.

Figure 4: The Dq curves for theoretically generated scale-free networks.

Figure 6: The Dq curves for theoretically generated small-world networks.